Cache append-only Trace turn counts#1820
Conversation
ApprovabilityVerdict: Needs human review This caching optimization has an unresolved review comment identifying a potential correctness issue where in-place node mutations could cause stale cache values. The concern is substantive enough to warrant human review to validate the 'append-only' assumption. You can customize Macroscope's approvability policy. Learn more. |
There was a problem hiding this comment.
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: f964fcdb1a
ℹ️ About Codex in GitHub
Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".
| if node_count == counted_nodes: | ||
| return num_turns |
There was a problem hiding this comment.
Invalidate turn cache for same-length node changes
When trace.num_turns has been read once, this early return makes the cache depend only on len(trace.nodes). Because nodes is a public mutable trace field, any same-length node replacement or sampled-flag correction leaves num_turns stale; for example, after changing one node from sampled=False to True, turn-limit checks and metrics continue to see the old count until the list length changes. Either invalidate the cache on node assignment/mutation or avoid the cached fast path when append-only mutation cannot be guaranteed.
Useful? React with 👍 / 👎.
The base branch was changed.
f964fcd to
141d443
Compare
Overview
Cache
Trace.num_turnsincrementally so turn-limit and interception reads scale with newly appended graph nodes rather than the full rollout history.Design
The message graph is owned by
Trace.nodes, and normal graph commits append nodes. The trace now keeps a private(counted_nodes, sampled_turns)cache:The cache is a Pydantic private attribute, so wire and disk serialization remain unchanged and deserialized traces begin with a cold cache. Turn semantics are also unchanged: only nodes marked
sampledcount, independent of branch topology. This does not restructure the graph or alter interception control flow.For a progressive rollout with a count read after every append, the prior expression visits
N(N + 1) / 2nodes. The incremental path visits each appended node once, reducing that work from quadratic to linear.Performance
Measured on 20,000 progressive sampled turns with separate old/new processes and three uninstrumented timing repetitions:
Resource measurements stayed effectively flat:
tracemallocpeaktracemallocretained after trace deletionThe incremental implementation performs no suffix-list copy; it walks newly appended nodes directly by index.
Note
Low Risk
Localized performance change to a derived property with explicit cache invalidation on list shrink; no API or serialization changes.
Overview
Trace.num_turnsno longer rescans allnodeson every read. It keeps a private(counted_nodes, sampled_turns)cache and only walks newly appended nodes; ifnodesshrinks, it rebuilds from scratch.Turn counting semantics are unchanged (only
samplednodes count). The cache is a PydanticPrivateAttr, so serialized traces are unaffected and start cold after load.This targets hot paths that read turn count each graph commit—e.g.
max_turnschecks in interception and eval dashboards—cutting repeated work from quadratic to linear over a long rollout.Reviewed by Cursor Bugbot for commit 141d443. Bugbot is set up for automated code reviews on this repo. Configure here.
Note
Cache
Trace.num_turnsincrementally for append-only message graphsThe
num_turnsproperty in trace.py previously summed over all nodes on every call. It now uses a_num_turns_cachetuple(counted_nodes, sampled_turns)to track progress incrementally, only iterating over newly appended nodes. If the node list shrinks, the cache is rebuilt from scratch.Macroscope summarized 141d443.